Prosodic analysis of disfluent events in a corpus of university lectures
Authors
Abstract
1 INESC-ID, 2 FLUL & 3 IST
This paper describes our efforts towards the analysis of the prosodic properties (pitch, energy, and duration) of disfluencies, aiming both at an overview of their global properties and at an analysis of their idiosyncratic behaviors. Underlying this task is the fact that disfluencies (e.g., filled pauses, prolongations, repetitions, substitutions, deletions, and insertions) characterize spontaneous speech and play a major role in speech structuring [1]. For speech processing, the analysis of the regular patterns of those phenomena is crucial [2,3]. In automatic speech recognition (ASR), their identification contributes to more robust language and acoustic models [4], and even in text-to-speech synthesis (TTS) they are being modeled to improve the naturalness of synthetic speech [5]. Moreover, when ASR and TTS are combined with machine translation systems, spontaneous speech translation still needs substantial improvements [6].
In our previous work for European Portuguese [7], we proposed that prosodic properties, mainly prosodic phrasing and contour shape, are essential to perform an evaluation task regarding fluency/disfluency distinctions. In this perspective, disfluencies may behave, and even be rated, as fluent communicative devices when different segmental and suprasegmental aspects are monitored. We now aim at extending our study to a more quantifiable characterization of the prosodic parameters of disfluencies. Specifically, our main goal is to verify whether disfluent events have distinct prosodic properties.
This work uses a subset of the LECTRA corpus [8], collected with the goal of transcribing university lectures for e-learning applications. The corpus totals 74 h, of which 10 h received multilayer annotation, including (among other information) an orthographic tier, a morpho-syntactic tier, and a disfluency tier annotated according to [2].
This small subset corresponds to 5 speakers (the lecturers in each of the recorded courses). Table 1 shows the distribution of the different disfluencies for each of the speakers.
[Table 1 column headers: al, eti, iou, oop, pmc, Total]
Similar resources
Comparing Different Machine Learning Approaches for Disfluency Structure Detection in a Corpus of University Lectures
This paper presents a number of experiments focusing on assessing the performance of different machine learning methods on the identification of disfluencies and their distinct structural regions over speech data. Several machine learning methods have been applied, namely Naive Bayes, Logistic Regression, Classification and Regression Trees (CARTs), J48 and Multilayer Perceptron. Our experiment...
Disfluency detection based on prosodic features for university lectures
This paper focuses on the identification of disfluent sequences and their distinct structural regions, based on acoustic and prosodic features. Reported experiments are based on a corpus of university lectures in European Portuguese, with roughly 32h, and a relatively high percentage of disfluencies (7.6%). The set of features automatically extracted from the corpus proved to be discriminant of...
Classification of disfluent phenomena as fluent communicative devices in specific prosodic contexts
This work explores prosodic cues of disfluent phenomena. In our previous work, we conducted a perceptual experiment regarding (dis)fluency ratings. Results suggested that some disfluencies may be considered felicitous by listeners, namely filled pauses and prolongations. In an attempt to discriminate which linguistic features are more salient in the classification of disfluencies as either flue...
Analysis of disfluencies in a corpus of university lectures
This paper analyzes the prosodic properties of disfluencies and of their contexts in a corpus of university lectures. Results show that there is a general tendency to repair fluency by means of prosodic contrast marking strategies (pitch and energy increase), regardless of the specific disfluency type, but still there are degrees in the contrast made by certain types. As for tempo patterns, the...
Prosodic classification of discourse markers
The first contribution of this study is the description of the prosodic behavior of discourse markers present in two speech corpora of European Portuguese (EP) in different domains (university lectures, and map-task dialogues). The second contribution is a multiclass classification to verify, given their prosodic features, which words in both corpora are classified as discourse markers, which a...
Publication date: 2010